Hybrid Machine Learning-Based Approach for Anomaly Detection using Apache Spark
Abstract
Over the past few decades, the volume of data has increased significantly in both scientific institutions and universities, with a large number of students enrolled and a high volume of related data. Furthermore, network traffic has increased post-pandemic with the use of online learning. Therefore, processing network traffic is a complex and challenging task that increases the possibility of intrusions and anomalies. Traditional security systems cannot deal with such high-speed, big data traffic. Real-time anomaly detection should be able to process data as quickly as possible to detect abnormal and malicious traffic. This paper proposes a hybrid approach consisting of supervised and unsupervised learning for anomaly detection based on the Apache Spark engine. Initially, the k-means algorithm was implemented in Spark's MLlib for clustering network traffic; then, for each cluster, K-nearest neighbors (KNN) was implemented for classification and anomaly detection. The proposed model was trained and validated against a real dataset from Ibn Zohr University. The results indicate that the proposed model outperformed other well-known algorithms in detecting anomalies in the aforementioned dataset. The experimental results show that the proposed approach can reach up to 99.94% accuracy using the k-fold cross-validation method on the complete dataset with all 48 features.
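The two-stage idea in the abstract (cluster the traffic with k-means, then classify inside each cluster with KNN) can be illustrated with a small single-machine sketch. This is not the authors' Spark implementation, and the abstract does not say how KNN was run on Spark. The class name `HybridDetector`, its parameters, and the toy two-feature data are all hypothetical and stand in for the 48-feature university dataset.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns the final list of centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # New centroid = mean of its members; an empty cluster keeps its old centroid.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


class HybridDetector:
    """Hypothetical sketch: unsupervised k-means, then supervised KNN per cluster."""

    def __init__(self, n_clusters=2, n_neighbors=3, seed=0):
        self.n_clusters = n_clusters
        self.n_neighbors = n_neighbors
        self.seed = seed

    def fit(self, X, y):
        # Stage 1 (unsupervised): cluster the feature vectors, ignoring labels.
        self.centroids = kmeans(X, self.n_clusters, seed=self.seed)
        # Stage 2 (supervised): keep the labelled rows of each cluster for KNN.
        self.members = [[] for _ in self.centroids]
        for features, label in zip(X, y):
            self.members[nearest(features, self.centroids)].append((features, label))
        return self

    def predict(self, features):
        # Route the sample to its cluster, then vote among its nearest neighbours.
        pool = self.members[nearest(features, self.centroids)]
        if not pool:  # empty cluster: fall back to every training row
            pool = [row for m in self.members for row in m]
        ranked = sorted(pool, key=lambda row: sq_dist(features, row[0]))
        votes = Counter(label for _, label in ranked[: self.n_neighbors])
        return votes.most_common(1)[0][0]


# Toy data (made up): two traffic groups, each with normal and anomalous rows.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (3, 3), (3, 4), (4, 3),
     (20, 20), (20, 21), (21, 20), (21, 21), (24, 24), (24, 23), (23, 24)]
y = ["normal"] * 4 + ["anomaly"] * 3 + ["normal"] * 4 + ["anomaly"] * 3

detector = HybridDetector(n_clusters=2, n_neighbors=3).fit(X, y)
for sample in [(0.5, 0.5), (3.5, 3.5), (20.5, 20.5), (23.5, 23.5)]:
    print(sample, detector.predict(sample))
```

Restricting the KNN search to one cluster keeps each neighbor lookup small, which is one plausible reason to pair the two algorithms on high-volume traffic. A distributed version would replace both stages with cluster-aware equivalents.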
Similar resources
MLlib: Machine Learning in Apache Spark
Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark’s open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shi...
Benchmarking Apache Spark with Machine Learning Applications
We benchmarked Apache Spark with a popular parallel machine learning training application, Distributed Stochastic Gradient Descent for Matrix Factorization [5] and compared the Spark implementation with alternative approaches for communicating model parameters, such as scheduled pipelining using POSIX socket or MPI, and distributed shared memory (e.g. parameter server [13]). We found that Spark...
A hybrid machine learning approach to network anomaly detection
Zero-day cyber attacks such as worms and spy-ware are becoming increasingly widespread and dangerous. The existing signature-based intrusion detection mechanisms are often not sufficient in detecting these types of attacks. As a result, anomaly intrusion detection methods have been developed to cope with such attacks. Among the variety of anomaly detection approaches, the Support Vector Machine...
Machine Learning for Host-based Anomaly Detection
Machine Learning for Host-based Anomaly Detection, by Gaurav Tandon. Dissertation Advisor: Philip K. Chan, Ph.D. Anomaly detection techniques complement signature-based methods for intrusion detection. Machine learning approaches are applied to anomaly detection for automated learning and detection. Traditional host-based anomaly detectors model system call sequences to detect novel attacks. This...
A Hybrid Machine Learning Method for Intrusion Detection
Data security is an important area of concern for every computer system owner. An intrusion detection system is a device or software application that monitors a network or systems for malicious activity or policy violations. Already various techniques of artificial intelligence have been used for intrusion detection. The main challenge in this area is the running speed of the available implemen...
Journal
Journal title: International Journal of Advanced Computer Science and Applications
Year: 2023
ISSN: 2158-107X, 2156-5570
DOI: https://doi.org/10.14569/ijacsa.2023.0140496